ABSTRACT

The high level of pollutants in the ambient air in 2016-2017 degraded air quality in Delhi at an alarming rate. Building on
our previous study, we analyzed the data to predict future air quality. Forecasting urban air pollution has become essential for
reducing its harmful consequences, and several machine learning techniques have been adopted for air quality forecasting. In this
paper, we implement various classification and regression techniques, namely linear regression, stochastic gradient descent (SGD)
regression, random forest regression, decision tree regression, support vector regression, artificial neural networks, gradient
boosting regression and adaptive boosting regression, to predict the air quality index from the main pollutants: PM2.5, PM10, CO,
NO2, SO2 and O3. The techniques are then evaluated using the RMS error, mean absolute error and R2, which indicate that support
vector regression and artificial neural networks are best suited to predicting New Delhi air quality.

Air quality in the Indian capital, Delhi, has been severe in recent years. A large number of people have been diagnosed with
asthma and other breathing problems. The main reason behind this is the high concentration of lethal PM2.5 particles suspended in
the atmosphere. A good model for predicting the concentration of these particles can help the population prepare prevention and
safety strategies and protect them from many health-related diseases. This work aims to predict hourly PM2.5 concentration levels
in different areas of Delhi by applying time series regression analysis based on various atmospheric and surface factors such as
wind speed, atmospheric temperature and pressure. The data analyzed were obtained from various weather monitoring sites installed
in the city by the Indian Meteorological Department (IMD). A regression model is proposed that uses Extra-Trees regression,
further boosted with AdaBoost. A comparative study with recent work indicates the efficiency of the proposed model.
INTRODUCTION:

Air is a mixture of various gases necessary to sustain life. However,
many factors such as deforestation, modernization, industrialization, vehicle
emissions and the population explosion contribute to air pollution by
releasing harmful pollutants such as nitrogen dioxide (NO2), sulfur dioxide
(SO2), lead (Pb), carbon monoxide (CO) and ozone (O3). Other factors,
including stubble burning, add hazardous particles such as PM2.5 and PM10.
These consist mainly of small solid and liquid particles suspended in air
with varied chemical compositions, including some organic compounds and ions
such as SO4(2-) and NO3(-). The most dangerous component of these pollutants
is PM2.5. As the name suggests, PM2.5 is atmospheric particulate matter (PM)
less than 2.5 micrometers in diameter, about 3% of the diameter of a human
hair. Concentrations of PM2.5 are measured in µg/m3. These particles are very
dangerous to health: they can easily penetrate deep into the lungs, irritate
and corrode the alveolar wall and, as a result, compromise lung function.
The negative effects of PM2.5 are not limited to asthma, inflammation,
impaired lung function and various other diseases; it can also cause cancer.
If these fine particles penetrate into the lungs, they may aggravate the
severity of COVID-19 infection, because the new coronavirus also attacks the
respiratory system. If the concentration of these polluting particles in the
environment is very high, it severely affects health and can cause serious
problems or death within a short period of time. Studies have established
that particulate matter also affects human health at the genetic level. The
work proposed in this article considers air pollution data for Delhi, where
pollution is most severe in winter; the data used were collected by the
Central Pollution Control Board.


Causes of air pollution:
Some of the main causes of air pollution are discussed below.


• Industrial exhaust: Emissions of harmful gases such as sulfur dioxide and
nitrogen oxides from the thermal power plants at Rajghat, Badarpur and
Indraprastha and from other industrial areas add to the main air pollutants
in Delhi.


• Vehicle emissions: Traffic congestion and vehicle emissions contribute
significantly to the deterioration of air quality in Delhi. Data published by
the Transport Department of the Delhi Government as of December 31, 2016 put
the total number of registered vehicles at 1.06.791. The largest category of
vehicles registered in the city is motorcycles and scooters, which number
63.40136. These are major factors contributing to air pollution.


• Burning of agricultural waste in Punjab and Haryana: Farmers in Punjab
and Haryana burn their rice crop residues to prepare their fields quickly for
the wheat crop.


• Construction and demolition: Constant construction and demolition increases
the level of dust particles in the air and is therefore considered dangerous.


• Other factors: Some of the factors that can indirectly lead to the
deterioration of air quality are overcrowding, road dust, smoke from Diwali
firecrackers, etc.


The major air pollutants in Delhi are:


1. Particulate Matter, RSPM, SPM (PM2.5, PM10): The main sources of
particles in Delhi are vehicle emissions, especially from heavy diesel
vehicles, road dust, thermal power plants and residential combustion
processes. Fine airborne particles (PM2.5) are considered more dangerous to
human health than PM10. The permissible average PM2.5 level is 60 micrograms
per cubic meter, but PM2.5 levels exceed 300 micrograms per cubic meter in
all parts of Delhi.


2. Nitrogen oxides (NOx): Nitrogen oxides are produced in industrial
combustion processes and mainly in vehicle exhaust. NOx levels are highest
in urban areas because of traffic. They are an important factor in the
production of the photochemical smog that covers the air in the city like a
blanket. Their detrimental effects include respiratory problems in adults
and children.


3. Sulfur dioxide (SO2): Formed mainly by burning fossil fuels, especially
in thermal power plants. This pollutant is a source of acid rain and
adversely affects the function of the lungs.


4. Benzene: The major sources of benzene are vehicle exhaust gases, other
industrial processes and industrial solvents. Benzene is a component of
crude oil and petrol. Evaporation at petrol stations, along with vehicle
exhaust, can increase the levels of benzene.


5. Ozone (O3): Formed by the chemical reaction of volatile organic compounds
and nitrogen dioxide in the presence of sunlight, so the ozone level is
higher in summer. Ground-level ozone also contributes to the formation of
photochemical smog.


6. Toluene: Toluene is another volatile industrial solvent; short-term
exposure can irritate the eyes and the respiratory tract. It is regarded as
a carcinogen and also affects the central nervous system.


7. Carbon monoxide (CO): CO is a toxic air pollutant caused by incomplete
combustion of carbon-containing fuels. One of the main sources is vehicle
exhaust, made worse by poor engine condition.


Air quality monitoring in Delhi:
Air pollution monitoring in Delhi is carried out at manual stations and at
continuous ambient air quality monitoring (CAAQM) stations. Under the
National Air Quality Monitoring Program (NAMP) [15] of the Central Pollution
Control Board (CPCB), manual monitoring of air pollution is conducted at
Sarojini Nagar, Chandni Chowk, Mayapuri Industrial Zone, Pitampura,
Shahadra, Shahzada Bagh, Nizamuddin, Janakpuri, Siri Fort and ITO across
Delhi. In addition to the manual stations, continuous air quality monitoring
is carried out at 11 locations, viz. Anand Vihar, Civil Lines, DCE, Dilshad
Garden, Dwarka, IGI Airport, ITO, Mandir Marg, Punjabi Bagh, R. K. Puram and
Shadipur. A map of all the Delhi monitoring stations is shown in Figure 1,
where the station circled in dark (R. K. Puram) was used for the study in
the model.
Figure 1: Map of air quality monitoring











Related work: 

In recent years, metropolitan cities around the world in particular have been
experiencing pollution levels that violate all international standards [1, 2],
which has caused many life-threatening problems. Although many factors cause
health problems, PM2.5 is one of the most important particles responsible.
The deadly impact of PM2.5 particles has drawn the attention of researchers
to the question of proposing a suitable model for predicting PM2.5 levels in
polluted air. Several models have been explored in this area to measure the
level of polluting particles in the air. Time series analysis of historical
atmospheric data, followed by regression on this data, is at the heart of
these models. The early models for measuring pollution levels were based on
statistical methods, including the Kalman filter [3] and single-variable
linear regression [4]. However, these failed to achieve a good level of
accuracy. This started a trend of using machine learning and neural network
based approaches [5] for PM2.5 prediction, because they can easily consider
several attributes at the same time. Models such as non-linear regression [6]
and neural network regression greatly increased accuracy. However, these
models overlook the dependence of PM2.5 on its preceding values. When time
series components were then combined with existing machine learning (ML)
models, prediction accuracy improved considerably.


Methods such as Multilayer Perceptron regression [7] and regression tree-based
methods [8] such as decision tree regression [9], Random Forest regression
[10], Lasso, etc. are at the forefront of this analysis. In addition, for even
greater accuracy, boosting techniques are incorporated into existing models; a
good example is XGBoost [11]. A study on the prediction of air pollution
through a machine learning approach was produced by Guan & Sinnott [12]. They
applied a Long Short-Term Memory (LSTM) network to air pollution data from
Melbourne, Australia, and observed that the LSTM network was able to predict
the concentration of PM2.5 in the air with considerable accuracy. Several
machine learning based models for PM2.5 prediction were studied by Joarestani
et al. [13]. They implemented XGBoost, Random Forests and deep learning on
multi-source remote sensing data to predict PM2.5 particulate matter in the
urban areas of Tehran, Iran. XGBoost was observed to be a more efficient
model than the other two in terms of R2 score, MAE and RMSE [14].


Some boosting techniques, e.g. AdaBoost, are often used to improve the
quality of the results produced by different machine learning models. There
are many use cases of time series forecasting assisted by boosting
techniques. A model based on an ensemble approach [15] used boosting in time
series forecasting of food crop quality. Xiao et al. [16] combined AdaBoost
with LSTM (Long Short-Term Memory) for sea surface temperature forecasting.
An improved gradient boosting decision tree algorithm based on the Kalman
filter was introduced by Li et al. [17]. A boosted LSTM was used for Internet
traffic prediction by Bian et al. [18]. Gradient boosting was also used to
improve the performance of the delay-based reservoir computing system of Tao
et al. [19]. AdaBoost was combined with SVM for the classification of time
series signals from patients with epilepsy, for the diagnosis of seizures, by
Hadeethi et al. [20].


Extra-Trees classifiers and regressors have also found applications in
various fields. Li et al. [21] stacked Extra-Trees with LSTM for predicting
dam displacement time series. John et al. used Extra-Trees regression for
real-time path estimation [22]. Extra-Trees have also produced commendable
results in forecasting daily flows, as reported by Tyrallis et al. [23].
The proposed work is an attempt to predict the PM2.5 level accurately and to
improve the accuracy of forecasts, especially for the atmosphere of Delhi. A
model is proposed for this, based on the Extra-Trees regressor [24] improved
with AdaBoost [25]. Extra-Trees is a tree ensemble technique that strongly
randomizes both the choice of cut-point and the attribute used when splitting
a tree node. It is used for supervised classification but can be extended to
regression problems [39]. AdaBoost, which stands for adaptive boosting, is a
boosting algorithm used in conjunction with other learning algorithms to
improve their performance [26, 27].

There are a number of air quality prediction models for evaluating and
predicting pollutant concentrations in urban areas. Traditionally,
statistical models and numerical models, the latter including chemical
transport and atmospheric dispersion models, were used for prediction.
Recently, machine learning methods have become the main techniques used in
air quality forecasting models.
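The combination of Extra-Trees and AdaBoost described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, and the ensemble sizes and random seeds are arbitrary choices made only so the example runs quickly.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, ExtraTreesRegressor

# Synthetic stand-in data (assumption): 200 samples, 4 weather-like features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=200)

# Extra-Trees is the base learner. It is passed positionally so the sketch
# does not depend on whether the keyword is `estimator` or `base_estimator`.
base = ExtraTreesRegressor(n_estimators=10, random_state=0)
model = AdaBoostRegressor(base, n_estimators=5, random_state=0)

# Fit on the first 150 samples and score (R^2) on the remaining 50.
model.fit(X[:150], y[:150])
r2 = model.score(X[150:], y[150:])
print(f"held-out R^2: {r2:.3f}")
```

With real data, the feature matrix would hold the meteorological inputs and the target would be the PM2.5 level; the split above ignores time ordering, which a real time series evaluation should respect.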


A. Statistical model: 

Statistical models learn from historical data and use that experience to
predict the future behaviour of the variable of interest. These models can
provide high accuracy. Notable statistical models used for air quality
forecasting include multiple linear regression and the autoregressive moving
average (ARMA) model [28]-[29]. However, because they cannot take into
account the dynamic behaviour of meteorological parameters, they are unable
to estimate exposure levels accurately.
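As a small illustration of the statistical family, the sketch below fits only the simplest autoregressive piece of an ARMA model, an AR(1) term, by least squares. It is an assumption-laden toy: the AQI values are made up, and a full ARMA model would also include moving-average terms.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in (x[t] - mean) = phi * (x[t-1] - mean) + noise."""
    mean = sum(series) / len(series)
    centered = [v - mean for v in series]
    num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    den = sum(a * a for a in centered[:-1])
    return mean, num / den


def forecast_next(series, mean, phi):
    """One-step-ahead forecast from the last observed value."""
    return mean + phi * (series[-1] - mean)


# Hypothetical AQI readings at consecutive time steps (illustrative only).
aqi = [110.0, 120.0, 135.0, 128.0, 140.0, 150.0, 145.0, 160.0]
mean, phi = fit_ar1(aqi)
next_value = forecast_next(aqi, mean, phi)
print(f"mean={mean:.1f}, phi={phi:.4f}, forecast={next_value:.2f}")
```

The forecast depends only on the previous value of the series, which is exactly why such models struggle when meteorological conditions change quickly.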


B. Numerical Models: 
Numerical methods generally use mathematical formulas to simulate
atmospheric processes and predict air quality. HIWAY2 (US EPA) [30] and
CALINE4 (California Department of Transportation) [31] are dispersion models
based on the Gaussian plume model. These models are used in particular to
predict vehicular pollution. Another type of numerical model is the
"chemical transport" model, which maps physical and chemical changes in the
atmosphere to the concentration of pollutants using mathematical formulas.
The Weather Research and Forecasting model coupled with Chemistry (WRF-CHEM)
is one such model and has been used to predict ozone concentration in
Shanghai, China [32]. Other studies [33]-[34] have emphasized the use of
other chemical transport models, such as the Community Multiscale Air
Quality model (CMAQ) and the Comprehensive Air Quality Model with Extensions
(CAMx), to predict concentrations of pollutants. However, these models
cannot fully capture the physics of pollutants and rely on simplifying
assumptions, so they are not well suited to short-term prediction of
concentrations that often fluctuate greatly.
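For orientation, the textbook Gaussian plume equation for a continuous point source with ground reflection can be written as a short function. This is a sketch of the underlying formula only; HIWAY2 and CALINE4 build line-source formulations on top of it, and the parameter names and example values below are assumptions chosen for illustration.

```python
import math


def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Concentration at crosswind offset y and height z (ground reflection included).

    q: emission rate (g/s), u: wind speed (m/s), h: effective source height (m),
    sigma_y, sigma_z: dispersion widths (m) at the downwind distance of interest.
    """
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


# Illustrative values only: ground-level receptor on and off the plume centerline.
centerline = gaussian_plume(q=10.0, u=3.0, sigma_y=30.0, sigma_z=15.0, y=0.0, z=0.0, h=20.0)
off_axis = gaussian_plume(q=10.0, u=3.0, sigma_y=30.0, sigma_z=15.0, y=40.0, z=0.0, h=20.0)
print(centerline, off_axis)
```

The dispersion widths grow with downwind distance and depend on atmospheric stability; here they are passed in directly rather than derived.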


C. Machine Learning Models: 

Thanks to technological advances, algorithms based on artificial
intelligence are widely used for air quality forecasting. Machine learning
approaches take several parameters into account for prediction, unlike a
purely statistical model. The Artificial Neural Network (ANN) seems to be
the most widely used air quality forecasting method [35]-[36]. Other studies
have shown the use of hybrid or mixed models based on neural networks for
prediction. Artificial intelligence algorithms such as fuzzy logic and
genetic algorithms, and Principal Component Analysis (PCA), have been used
along with ANNs in the design of models such as the ANFIS (Adaptive
Neuro-Fuzzy Inference System) model [37], PCA-ANN models [38]-[39], etc.
Other machine learning models include the Support Vector Machine (SVM)
based model [40], PCA-SVM [41] and many others. A wavelet-modified
back-propagation neural network (W-BPNN) [42], which combines a
back-propagation neural network with the wavelet transform, has also been
implemented to predict the concentrations of SO2, NO2 and PM10. Another
study, conducted in Quito, Ecuador [43], used six weather factors to predict
the concentration of PM2.5. K. Hu et al. designed the machine learning model
HazeEst to predict air quality. The system was first evaluated using seven
different regression models, and SVR was finally selected as the final
forecast model. Similarly, research conducted in Gauteng, South Africa [44]
predicted surface ozone concentration using ANN and multiple linear
regression techniques. Another efficient machine learning method is the
Extreme Learning Machine (ELM), which is a non-linear machine learning
algorithm [45]. Here, a randomized neural network was used to predict the
concentrations of O3, NO2 and PM2.5 with these non-linear techniques, using
data from 6 stations spread across Canada.


METHODOLOGY: 
The five-step procedure for estimating air quality proceeds as shown in Figure 2.
The detailed process is explained below.


A. Data Collection: 

1) Site Description: New Delhi (28.61°N, 77.23°E), the capital of India, is
located on the Yamuna plain, with elevations varying from 650 feet to 820
feet across the city. Being landlocked, it lacks the sea breeze that would
replace polluted air with relatively clean air from the sea. The
fast-growing adjacent residential, commercial and industrial areas also
make it difficult to flush out contaminated air, which increases pollution
in the city center. The climate of New Delhi is a monsoon-influenced humid
subtropical climate with annual precipitation of about 700 mm, most of
which falls during the monsoon season, extending from mid-June to August
[46].
Figure 1.1: Snapshot of the dataset used (Nidhi Sharmaa et al. [51])
Fig 2. Process for estimating air quality (Chavi Srivastava et al. [1])


2) Data Source: Pollutant data from several air monitoring sites are
considered for this study. The sites are R. K. Puram, Punjabi Bagh and
Anand Vihar [47], shown in Figure 3. These monitoring sites are located in
the most polluted areas, which is the reason for choosing them. Data on
the common pollutants for New Delhi, namely CO, NO2, SO2, O3, PM2.5 and
PM10, were collected from the Central Pollution Control Board (CPCB) site,
together with an "Air and Noise Monitoring System" designed to collect
pollution concentrations. This system has several sensors, including a
noise sensor, a Wi-Fi module to send the data to the cloud and an SD card
for storing data on the device itself. The records are stored in the cloud
on the ThingSpeak IoT platform so that anyone can view them. Data on
influencing factors such as temperature, wind direction, relative
humidity, wind speed, etc. were also obtained from the above source.
Records were collected from January 2016 to September 2017 at 4-hour
intervals (see Table I).
Fig 3. The pollution monitoring stations selected for study in New Delhi





B. Data Pre-processing: 
Data Refinement: The data to be analyzed were cleaned by removing
instances with missing values in the input parameters. Missing values of
the target, i.e. the pollutant, are estimated using an imputation function.
The strategy used here for the estimate is the mean.
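The refinement step above can be sketched in plain Python. The field names (`temp`, `ws`, `pm25`) and the sample rows are hypothetical; the logic follows the description: drop rows with a missing input, then fill a missing target with the mean of the observed targets.

```python
def refine(rows, input_keys, target_key):
    """Drop rows with missing inputs, then mean-impute missing target values."""
    # Keep only rows where every input parameter is present (copies, not originals).
    kept = [dict(r) for r in rows if all(r.get(k) is not None for k in input_keys)]
    # Mean of the target values that were actually observed in the kept rows.
    observed = [r[target_key] for r in kept if r.get(target_key) is not None]
    mean = sum(observed) / len(observed)
    for r in kept:
        if r.get(target_key) is None:
            r[target_key] = mean
    return kept


# Hypothetical records; None marks a missing value.
rows = [
    {"temp": 20.0, "ws": 1.5, "pm25": 180.0},
    {"temp": None, "ws": 2.0, "pm25": 150.0},  # dropped: missing input
    {"temp": 18.0, "ws": 0.8, "pm25": None},   # target imputed with the mean
    {"temp": 22.0, "ws": 1.1, "pm25": 220.0},
]
clean = refine(rows, ["temp", "ws"], "pm25")
print(clean)
```

In a scikit-learn workflow the same mean strategy is available through an imputer class, but the hand-written version makes the two separate rules explicit.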


Data Transformation: Before normalizing the dataset, all parameters are
transformed for ease of calculation. The input parameter wind direction,
which is expressed in degrees, is therefore converted to a dimensionless
wind direction index. The National Air Quality Index standards prescribed
by the CPCB (Central Pollution Control Board) are used to express the
concentrations of the various pollutants in India as an index [1]. In the
third case, i.e. the gases CO, NO2, SO2 and O3, the AQI is calculated for
each gas and the maximum among these is selected for a given instance for
the purposes of analysis.
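The "maximum of the per-gas indices" rule can be sketched as follows. The sub-index is computed by linear interpolation inside a concentration band, which is the usual form of an AQI scale; the breakpoint table below is purely illustrative and is not the official CPCB table, and the concentrations are made up.

```python
def sub_index(conc, breakpoints):
    """Linear interpolation within the band that contains conc.

    breakpoints: list of (c_lo, c_hi, i_lo, i_hi) tuples in ascending order.
    """
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= conc <= c_hi:
            return i_lo + (i_hi - i_lo) * (conc - c_lo) / (c_hi - c_lo)
    raise ValueError("concentration outside the breakpoint table")


# Illustrative breakpoints only (assumption), reused for every gas for brevity.
DEMO_BREAKPOINTS = [
    (0.0, 40.0, 0.0, 50.0),
    (40.0, 80.0, 50.0, 100.0),
    (80.0, 180.0, 100.0, 200.0),
]


def gas_aqi(concentrations, tables):
    """Return (AQI, dominant gas): the largest sub-index among the gases."""
    subs = {gas: sub_index(c, tables[gas]) for gas, c in concentrations.items()}
    worst = max(subs, key=subs.get)
    return subs[worst], worst


concentrations = {"NO2": 60.0, "SO2": 20.0, "O3": 100.0}  # hypothetical readings
tables = {gas: DEMO_BREAKPOINTS for gas in concentrations}
aqi, dominant = gas_aqi(concentrations, tables)
print(aqi, dominant)
```

In practice each gas has its own breakpoint table and averaging period, so `tables` would map each gas to a different list.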





TABLE I. DATASET USED IN THE EXPERIMENT

Station        Number of instances   Input parameters
R. K. Puram    3489                  RH, Temp, WS, VWS, Prev AQI
Punjabi Bagh   3451                  RH, Temp, WS, VWS, Prev AQI, WD
Anand Vihar    3448                  RH, Temp, WS, WD, Prev AQI

(Nidhi Sharmaa et al. [51])
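The "Prev AQI" input in Table I is a lagged copy of the target series. A minimal sketch of building it is shown below; the one-step lag (one 4-hour record back) is an assumption, since the source does not state how far back the previous value is taken, and the readings are made up.

```python
def add_prev_aqi(aqi_series):
    """Pair each AQI reading with the reading one record earlier: (prev, current)."""
    return list(zip(aqi_series[:-1], aqi_series[1:]))


# Hypothetical AQI readings at consecutive 4-hour records.
series = [150.0, 162.0, 158.0, 171.0]
pairs = add_prev_aqi(series)
print(pairs)
```

The first record has no predecessor, so the lagged dataset is one row shorter than the original series.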











Data Normalization: If the input consists of many attributes with different
units, it is essential to scale these attributes to a specific range so
that all attributes have the same weight. This ensures that a less
significant attribute with a broader range does not outweigh a possibly
more important attribute. (Chavi Srivastava et al. [2])
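One common way to put attributes on a shared range is min-max scaling to [0, 1], sketched below. The source does not say which scaling method was used, so this choice and the sample values are assumptions.

```python
def min_max_scale(column):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical inputs in different units: temperature (deg C) and humidity (%).
temp = [10.0, 20.0, 30.0]
rh = [40.0, 70.0, 100.0]
scaled = {"temp": min_max_scale(temp), "rh": min_max_scale(rh)}
print(scaled)
```

After scaling, a 10-degree change in temperature and a 30-point change in humidity occupy the same fraction of their ranges, so neither dominates purely because of its units.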


Feature Selection: 

Feature selection is the process of selecting a subset of the initial
features that contains the information relevant to predicting the output.
In the case of redundant data, feature extraction is used. Feature
extraction includes selection of the optimal input parameters for the
selected input dataset. The resulting reduced dataset is used for analysis.
The maximum number of inputs available for analysis is six, so all inputs
are selected for the calculations.


Training the Model: 

The regression techniques mentioned in Section III-B are implemented using
Python and scikit-learn, an open source machine learning library [49].
Anaconda Navigator v5.1, an open source Python data science platform, is
used to launch Jupyter/IPython Notebook (an open source Python editor) for
programming in Python. There are three cases for each station: the first
case is AQI from PM2.5, the second is AQI from PM10 and the last is AQI from
gases. There are therefore nine training datasets in total, and eight
regression models were trained on each. Figure 4 shows a comparison of the
estimated and actual values for the eight regression models for AQI from
PM2.5 at R. K. Puram station. Similar results were obtained for the other
eight cases.
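Training the eight regression models on one dataset can be sketched with scikit-learn as below. The abbreviations match those used in the result tables; the data are synthetic, and every hyperparameter shown is an assumption, since the source does not list the settings used.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for one normalized training set (assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = X @ np.array([0.8, -0.5, 0.3, 0.0, 0.6]) + 0.2 * rng.normal(size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

models = {
    "LR": LinearRegression(),
    "SGD": SGDRegressor(random_state=0),
    "RFR": RandomForestRegressor(n_estimators=50, random_state=0),
    "DTR": DecisionTreeRegressor(random_state=0),
    "MLP": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    "SVR": SVR(),
    "GBR": GradientBoostingRegressor(random_state=0),
    "ABR": AdaBoostRegressor(random_state=0),
}

# Fit each model and record its R^2 on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Repeating this loop over the nine station and pollutant combinations would give one row of scores per model for each case.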


RESULT AND DISCUSSION: 

Performance evaluation is essential to assess the suitability of a
predictive model. After the model is created, metrics are used to get
feedback and make the necessary changes until the desired accuracy is
achieved or no further improvement in the metrics is possible. Hence
evaluation of the model is important for improving its performance on test
datasets [50]. Various statistical metrics are used for evaluating a model,
depending on its design, its designated task, etc. We use Mean Square Error
(MSE), Mean Absolute Error (MAE) and R2 to evaluate the regression
techniques used to create the models. The performance of the models for
each case at R. K. Puram, Punjabi Bagh and Anand Vihar is shown in Table II,
Table III and Table IV respectively. The results are favourable, as the fit
of the models varies from fair to good. From Table II we can see that for
the R. K. Puram monitoring station, DTR, SVR and MLP provide the lowest
estimation errors, while the GBR technique offers maximum accuracy with
relatively small errors. From Table III it can be concluded that for the
Punjabi Bagh monitoring station, MLP gives the lowest estimation errors and
fairly high accuracy with low errors. From Table IV we can conclude that
for the Anand Vihar monitoring station, SVR gives the lowest estimation
errors and fairly high accuracy with low errors. Considered overall, SVR
and neural networks (MLP) are best suited to our purposes. The results
obtained illustrate the benefits of integrating IoT and big data analysis
with machine learning. (Chavi Srivastava et al. [1])
Figure 4: Samples and output (Chavi Srivastava et al. [1])
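For reference, the three evaluation metrics named above can be computed as follows. This is a plain-Python sketch using the standard definitions; the sample values are made up.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted values.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Lower MSE and MAE indicate smaller errors, while R2 closer to 1 indicates that the model explains more of the variation in the target.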








TABLE II. ESTIMATION ACCURACY FOR STATION 1 - R. K. PURAM

       PM2.5                         PM10                          O3, NO2, CO, SO2
Model  MSE       MAE       R2        MSE       MAE       R2        MSE       MAE       R2
LR     0.3434    0.42805   0.65646   0.4837    0.44082   0.461     0.5870    0.56640   0.30026
SGD    0.3186    0.44677   0.65922   0.5214    0.41981   0.41984   0.6401    0.54677   0.23699
RFR    0.41      0.40      0.67      0.4589    0.43030   0.48940   0.5901    0.55474   0.40545
DTR    0.20      0.43      0.62      0.4632    0.44618   0.48461   0.5847    0.56899   0.41096
MLP    0.2797    0.3747    0.69275   0.4129    0.39769   0.31049   0.5111    0.50353   0.48502
SVR    0.29467   0.3627    0.68478   0.5862    0.42779   0.34772   0.5177    0.48160   0.47837
GBR    0.2764    0.36642   0.69647   0.4506    0.41905   0.49858   0.277     0.50117   0.48841
ABR    0.4650    0.42805   0.69275   0.6197    0.61545   0.31049   1.2550    0.9879    0.2643

TABLE III. ESTIMATION ACCURACY FOR STATION 2 - PUNJABI BAGH

       PM2.5                         PM10                          O3, NO2, CO, SO2
Model  MSE       MAE       R2        MSE       MAE       R2        MSE       MAE       R2
LR     0.3081    0.41320   0.68391   0.5049    0.42837   0.59798   0.7676    0.58008   0.26773
SGD    0.3302    0.42952   0.66128   0.6448    0.44080   0.48669   0.$355    0.55218   0.20291
RFR    0.3121    0.41496   0.67983   0.4775    0.41039   0.61982   0.7695    0.58196   0.26584
DTR    0.3314    0.43264   0.66006   0.4722    0.43316   0.62403   0.6471    0.55851   0.38261
MLP    0.2856    0.39566   0.76760   0.4667    0.40402   0.62843   0.6456    0.51148   0.38410
SVR    0.3192    0.39551   0.67245   0.4205    0.37312   0.66513   0.6712    0.47173   0.35962
GBR    0.2799    0.39422   0.71286   0.4503    0.38574   0.64147   0.6551    0.51527   0.37001
ABR    0.3762    0.51584   0.61406   0.8883    0.76612   0.29271   1.5953    1.09333   AS52119

TABLE IV. ESTIMATION ACCURACY FOR STATION 3 - ANAND VIHAR

       PM2.5                         PM10                          O3, NO2, CO, SO2
Model  MSE       MAE       R2        MSE       MAE       R2        MSE       MAE       R2
LR     0.5196    0.54908   0.49129   0.4149    0.45045   0.$1443   0.6006    0.56644   0.36483
SGD    0.5667    0.9122    0.44512   0.4139    0.44560   0.1848    0.6852    0.60091   0.3070
RFR    0.4664    0.41264   0.54333   0.3973    0.30990   0.62453   0.4687    0.44808   0.40170
DTR    0.4123    0.44137   0.49852   0.4283    0.43041   0.59008   0.6149    0.458616  0.34971
MLP    0.3976    0.46062   0.61067   0.43458   0.40011   0.49472   0.4441    0.44381   0.41294
SVR    0.4054    0.46004   0.60323   0.4393    0.390064  0.48469   0.4429    0.42487   0.41417
GBR    0.4087    0.47410   0.59986   0.4398    0.39929   0.58524   0.8421    0.$4177   0.42867
ABR    0.6390    0.64212   0.37439   0.8687    0.81283   0.18082   0.9216    0.77826   0.02534














Our final conclusion is that, by applying the machine learning techniques
described above, we can predict the air quality index. This information will
be useful to the authorities for taking adequate action and for providing
information to the general public, such as safety measures and precautions [1].














Future scope: 

The dataset used in this study is short, which limits the capabilities of the
model. Hence the use of longer-duration records without data gaps is
recommended for further improvement. In future work, we can introduce more
weather factors such as precipitation, minimum and maximum temperatures,
solar radiation, vapor pressure, etc. to improve the accuracy of the system.
Unclear trends and large fluctuations in air pollutants are also associated
with emissions from pollution sources such as transport and industry; these
factors must also be taken into account.


Acknowledgment: 
The author would like to thank the Central Pollution Control Board, Delhi, for
providing data on the pollutants, namely CO, NO2, O3, SO2, PM2.5 and PM10, and
on influencing factors such as wind speed, wind direction, temperature, etc.


Research Paper E-ISSN No : 2454-9916 | Volume: 7 | Issue: 4 | Apr 2021